6 research outputs found
Rate-Compatible Polar Codes for Automorphism Ensemble Decoding
Recently, automorphism ensemble decoding (AED) has drawn research interest as
a more computationally efficient alternative to successive cancellation list
(SCL) decoding of polar codes. Although AED has demonstrated superior
performance for specific code parameters, a flexible code design that can
accommodate varying code rates does not yet exist. This work proposes a
theoretical framework for constructing rate-compatible polar codes with a
prescribed automorphism group, which is a key requirement for AED. We first
prove that a one-bit granular sequence with useful automorphisms cannot exist.
However, by allowing larger steps in the code dimension, flexible code
sequences can be constructed. An explicit synthetic channel ranking based on
the β-expansion is then proposed to ensure that all constructed codes
possess the desired symmetries. Simulation results, covering a broad range of
code dimensions and blocklengths, show a performance comparable to that of 5G
polar codes under cyclic redundancy check (CRC)-aided SCL decoding, but
with lower complexity.
Comment: 5 pages, 2 figures, submitted to IEEE for possible publication
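The β-expansion named in the abstract above is commonly used to rank the synthetic channels of a polar code by a "polarization weight". The following is a minimal sketch of that standard ranking only, assuming the widely used value β = 2^(1/4). It is not the symmetry-preserving construction proposed in the paper, and the function names are hypothetical.

```python
# Minimal sketch of the standard beta-expansion (polarization weight) ranking.
# Assumption: beta = 2**(1/4), a commonly used value; the paper's adapted
# ranking that guarantees automorphisms is NOT reproduced here.

def polarization_weights(n, beta=2 ** 0.25):
    """Weight of each of the N = 2**n synthetic channels.

    Channel index i with binary expansion (b_{n-1} ... b_0) gets the
    weight sum_j b_j * beta**j.
    """
    return [
        sum(beta ** j for j in range(n) if (i >> j) & 1)
        for i in range(1 << n)
    ]

def reliability_order(n, beta=2 ** 0.25):
    """Channel indices sorted from least to most reliable."""
    weights = polarization_weights(n, beta)
    return sorted(range(1 << n), key=lambda i: weights[i])

def information_set(n, k, beta=2 ** 0.25):
    """Indices of the k most reliable channels, in ascending order."""
    return sorted(reliability_order(n, beta)[-k:])

if __name__ == "__main__":
    print(reliability_order(3))   # ranking for blocklength N = 8
    print(information_set(3, 4))  # information set of a (8, 4) code
```

Because the whole ranking is fixed once, taking the top k entries for increasing k gives a nested family of codes. This nesting is the basic mechanism behind rate compatibility.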
Learning Joint Detection, Equalization and Decoding for Short-Packet Communications
We propose and practically demonstrate a joint detection and decoding scheme
for short-packet wireless communications in scenarios that require first
detecting the presence of a message before actually decoding it. For this, we
extend the recently proposed serial Turbo-autoencoder neural network (NN)
architecture and train it to find short messages that can be, all "at once",
detected, synchronized, equalized and decoded when sent over an unsynchronized
channel with memory. The conceptual advantage of the proposed system stems
from a holistic message structure with superimposed pilots for joint detection
and decoding without relying on a dedicated preamble. This results
not only in higher spectral efficiency, but also translates into the
possibility of shorter messages compared to using a dedicated preamble. We
compare the detection error rate (DER), bit error rate (BER) and block error
rate (BLER) performance of the proposed system with a hand-crafted
state-of-the-art conventional baseline and our simulations show a significant
advantage of the proposed autoencoder-based system over the conventional
baseline in every scenario up to messages conveying k = 96 information bits.
Finally, we practically evaluate and confirm the improved performance of the
proposed system over-the-air (OTA) using a software-defined radio (SDR)-based
measurement testbed.
Comment: Submitted to IEEE TCO
Component Training of Turbo Autoencoders
Isolated training with Gaussian priors (TGP) of the component autoencoders of
turbo-autoencoder architectures enables faster, more consistent training and
better generalization to arbitrary decoding iterations than training based on
deep unfolding. We propose fitting the components via extrinsic information
transfer (EXIT) charts to a desired behavior which enables scaling to larger
message lengths while retaining competitive performance. To
the best of our knowledge, this is the first autoencoder that performs close to
classical codes in this regime. Although the binary cross-entropy (BCE) loss
function optimizes the bit error rate (BER) of the components, the design via
EXIT charts makes it possible to focus on the block error rate (BLER). In serially
concatenated systems the component-wise TGP approach is well known for inner
components with a fixed outer binary interface, e.g., a learned inner code or
equalizer, with an outer binary error correcting code. In this paper we extend
the component training to structures with an inner and outer autoencoder, where
we propose a new 1-bit quantization strategy for the encoder outputs based on
the underlying communication problem. Finally, we discuss the model complexity
of the learned components during design time (training) and inference and show
that the number of weights in the encoder can be reduced by 99.96 %.
Comment: Submitted to ISTC 2023, 5 pages
A Polar Subcode Approach to Belief Propagation List Decoding
Permutation decoding has recently gained interest as it can exploit the symmetries
of a code in a parallel fashion. Moreover, it has been shown that by viewing
permuted polar codes as polar subcodes, the set of usable permutations in
permutation decoding can be increased. We extend this idea to pre-transformed
polar codes, such as cyclic redundancy check (CRC)-aided polar codes, which
previously could not be decoded using permutations due to their lack of
automorphisms. Using belief propagation (BP)-based subdecoders, we showcase a
performance close to CRC-aided SCL (CA-SCL) decoding. The proposed algorithm
outperforms the previously best performing iterative CRC-aided belief
propagation list (CA-BPL) decoder both in error-rate performance and decoding
latency.
Comment: 6 pages, submitted to IEEE for possible publication
Efficient FPGA Implementation of an ANN-Based Demapper Using Cross-Layer Analysis
In the field of communications, an autoencoder (AE) refers to a system that replaces parts of the traditional transmitter and receiver with artificial neural networks (ANNs). To meet the system performance requirements, the AE must adapt to changing wireless-channel conditions at runtime. Thus, online fine-tuning in the form of ANN retraining is of great importance. Many algorithms on the ANN layer have been developed to improve the AE’s performance at the communication layer. Yet, the link between the system performance and the ANN topology on the one hand and the hardware layer on the other is not fully explored. In this paper, we analyze the relations between the design layers and present a hardware implementation of an AE-based demapper that enables fine-tuning to adapt to varying channel conditions. As a platform, we selected field-programmable gate arrays (FPGAs), which provide high flexibility and make it possible to satisfy the low-power and low-latency requirements of embedded communication systems. Furthermore, our cross-layer approach leverages the flexibility of FPGAs to dynamically adapt the degree of parallelism (DOP) to satisfy the system-level requirements and to ensure environmental adaptation. Our solution achieves 2000× higher throughput than a high-performance graphics processing unit (GPU), draws 5× less power than an embedded central processing unit (CPU) and is 5800× more energy efficient than an embedded GPU for small batch sizes. To the best of our knowledge, such a cross-layer design approach combined with FPGA implementation is unprecedented.